Functions of a Matrix

The motivation for studying matrix-valued functions stems from the differential equations that describe linear systems.

$$\dot{x}(t) = A x(t)$$
$$x(t) = e^{At} x(0)$$

The power series representation of an analytic function is

$$f(s) = \sum_{i=0}^{\infty} \alpha_i s^i$$

where $s \in \mathbb{C}$.

The power series representation of a matrix-valued function is

$$f(A) = \sum_{i=0}^{\infty} \alpha_i A^i$$

where $A \in \mathbb{C}^{n \times n}$. The result is another matrix of the same size as $A$.

The exponential function is defined for matrices as

$$e^t = \sum_{i=0}^{\infty} \frac{t^i}{i!} \implies e^A = \sum_{i=0}^{\infty} \frac{A^i}{i!}$$

By the Cayley-Hamilton theorem, $A^n$ (and hence every higher power of $A$) can be written as a linear combination of $I, A, A^2, \cdots, A^{n-1}$, so the infinite series collapses to

$$e^A = c_0 I + c_1 A + c_2 A^2 + \cdots + c_{n-1} A^{n-1}$$

where the $c_i$ are scalars.

Remark: One can instead use the minimal polynomial of the matrix to express every power of $A$ in terms of $I, A, A^2, \cdots, A^{l-1}$, where $l$ is the degree of the minimal polynomial.

$$e^A = c_0 I + c_1 A + c_2 A^2 + \cdots + c_{l-1} A^{l-1}$$
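As a quick numerical illustration (a sketch using NumPy, not part of the original notes): for a $2 \times 2$ matrix the Cayley-Hamilton identity reads $A^2 = \operatorname{tr}(A)\,A - \det(A)\,I$, so every higher power stays in $\operatorname{span}\{I, A\}$.

```python
# Numerical check of the Cayley-Hamilton reduction for a 2x2 matrix:
# the characteristic polynomial gives A^2 = tr(A) A - det(A) I,
# so every higher power collapses onto span{I, A}.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # example matrix (also used later in these notes)
I = np.eye(2)

tr, det = np.trace(A), np.linalg.det(A)

# A^2 expressed through I and A
assert np.allclose(A @ A, tr * A - det * I)

# One more application of the same identity gives
# A^3 = tr(A) A^2 - det(A) A = (tr^2 - det) A - tr*det I
assert np.allclose(A @ A @ A, (tr**2 - det) * A - tr * det * I)
print("Cayley-Hamilton reduction verified")
```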

First Method

Let

$$f(s) = \sum_{i=0}^{\infty} \alpha_i s^i$$
$$f(A) = \sum_{i=0}^{\infty} \alpha_i A^i$$

Define $p(s)$ and $P(A)$ as follows:

$$p(s) = c_0 + c_1 s + c_2 s^2 + \cdots + c_{l-1} s^{l-1}$$
$$P(A) = c_0 I + c_1 A + c_2 A^2 + \cdots + c_{l-1} A^{l-1}$$

Then we have the equality

$$f(A) = P(A)$$

where $P(A)$ is a polynomial in $A$.

Case I: $A$ is diagonalizable

Suppose

$$m(s) = (s - \lambda_1)(s - \lambda_2) \cdots (s - \lambda_{\sigma})$$

with $l = \sigma$ and $m_1 = m_2 = \cdots = m_{\sigma} = 1$.

$$f(A) = P(A) = c_0 I + c_1 A + c_2 A^2 + \cdots + c_{l-1} A^{l-1}$$

Let $e_i$ be an eigenvector of $A$ corresponding to $\lambda_i$, so that

$$A e_i = \lambda_i e_i \implies A^n e_i = \lambda_i^n e_i$$

Multiplying both sides of $f(A) = P(A)$ by $e_i$ from the right,

$$\sum_{n=0}^{\infty} \alpha_n A^n e_i = \sum_{n=0}^{l-1} c_n A^n e_i$$
$$\sum_{n=0}^{\infty} \alpha_n \lambda_i^n e_i = \sum_{n=0}^{l-1} c_n \lambda_i^n e_i$$
$$f(\lambda_i) e_i = P(\lambda_i) e_i$$

Since $e_i \neq 0$, this yields one equation per eigenvalue:

$$f(\lambda_i) = P(\lambda_i), \qquad i = 1, 2, \dots, \sigma$$

These $l = \sigma$ equations determine the coefficients $c_0, c_1, \dots, c_{l-1}$.


Example: $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$. Find $e^A$ and $\log(A)$.

Solution:

$$d(s) = (s - 3)(s - 1)$$

$\lambda_1 = 3$, $\lambda_2 = 1$

$$m(s) = (s - 3)(s - 1), \qquad l = 2$$

$$p(s) = c_0 + c_1 s, \qquad P(A) = c_0 I + c_1 A$$

$$A e_1 = 3 e_1, \qquad A e_2 = e_2$$

$$f(3) = p(3), \qquad f(1) = p(1)$$

$$f(3) = e^3 = c_0 + 3 c_1, \qquad f(1) = e = c_0 + c_1$$

$$c_1 = \frac{e^3 - e}{2}, \qquad c_0 = \frac{3e - e^3}{2}$$

$$e^A = \frac{3e - e^3}{2} I + \frac{e^3 - e}{2} A$$

$$e^A = \frac{3e - e^3}{2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \frac{e^3 - e}{2} \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$$

For $\log(A)$, take $f(s) = \ln s$ and repeat the same steps:

$$\ln 3 = c_0 + 3 c_1, \qquad \ln 1 = 0 = c_0 + c_1$$

$$c_1 = \frac{\ln 3}{2}, \qquad c_0 = -\frac{\ln 3}{2}$$

$$\log(A) = \frac{\ln 3}{2} \left( A - I \right)$$
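A quick numerical sanity check of the closed form for $e^A$ (a NumPy sketch, not part of the original notes), comparing the two-term polynomial against a truncated power series for the matrix exponential:

```python
# Verify e^A = ((3e - e^3)/2) I + ((e^3 - e)/2) A numerically
# by comparing against a truncated power series for the matrix exponential.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
I = np.eye(2)
e = np.e

# Closed form obtained from the conditions f(lambda_i) = p(lambda_i)
expA_closed = (3*e - e**3) / 2 * I + (e**3 - e) / 2 * A

# Truncated series sum_{i=0}^{N} A^i / i!
expA_series = np.zeros((2, 2))
term = np.eye(2)
for i in range(30):
    expA_series += term
    term = term @ A / (i + 1)

assert np.allclose(expA_closed, expA_series)
print("closed form matches the series")
```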


Case II: $A$ is not diagonalizable

Consider the following example. Let $A \in \mathbb{R}^{3 \times 3}$ and $m(s) = (s-\lambda_1)^2 (s-\lambda_2)$.

$$J = \begin{bmatrix} \lambda_1 & 1 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{bmatrix}$$

$$\sum m_i = 2 + 1 = 3$$

$$f(A) = P(A) = c_0 I + c_1 A + c_2 A^2$$

$$f(\lambda_1) = P(\lambda_1), \qquad f(\lambda_2) = P(\lambda_2)$$

These two equations are not enough to find $c_0, c_1, c_2$.

Consider the matrix $P$ that transforms $A$ into its Jordan canonical form $J$.

The matrix $P$ satisfies $P^{-1} A P = J$, equivalently $A P = P J$, and is of the form $[e_1 \; f_1 \; e_2]$, where $e_1$ and $e_2$ are the eigenvectors of $A$ corresponding to $\lambda_1$ and $\lambda_2$ respectively, and $f_1$ is the generalized eigenvector of $A$ corresponding to $\lambda_1$.

$$P = \begin{bmatrix} \vdots & \vdots & \vdots \\ e_1 & f_1 & e_2 \\ \vdots & \vdots & \vdots \end{bmatrix}$$

$$\begin{bmatrix} \vdots & \vdots & \vdots \\ e_1 & f_1 & e_2 \\ \vdots & \vdots & \vdots \end{bmatrix} \begin{bmatrix} \lambda_1 & 1 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{bmatrix} = A \begin{bmatrix} \vdots & \vdots & \vdots \\ e_1 & f_1 & e_2 \\ \vdots & \vdots & \vdots \end{bmatrix}$$

Reading off the second column gives $A f_1 = \lambda_1 f_1 + e_1$, and repeated application yields

$$\begin{align*} A f_1 &= \lambda_1 f_1 + e_1 \\ A^2 f_1 &= \lambda_1 A f_1 + A e_1 = \lambda_1^2 f_1 + \lambda_1 e_1 + \lambda_1 e_1 = \lambda_1^2 f_1 + 2 \lambda_1 e_1 \\ A^3 f_1 &= \lambda_1^2 A f_1 + 2 \lambda_1 A e_1 = \lambda_1^3 f_1 + 3 \lambda_1^2 e_1 \\ &\;\;\vdots \\ A^k f_1 &= \lambda_1^k f_1 + k \lambda_1^{k-1} e_1 \end{align*}$$

Return to the equation $f(A) = P(A)$.

$$\sum_{i=0}^{\infty} \alpha_i A^i = \sum_{i=0}^{l-1} c_i A^i$$

Multiply both sides by f1f_1 from the right.

$$\sum_{i=0}^{\infty} \alpha_i A^i f_1 = \sum_{i=0}^{l-1} c_i A^i f_1$$
$$\sum_{i=0}^{\infty} \alpha_i \lambda_1^i f_1 + \sum_{i=0}^{\infty} \alpha_i \, i \, \lambda_1^{i-1} e_1 = \sum_{i=0}^{l-1} c_i \lambda_1^i f_1 + \sum_{i=0}^{l-1} c_i \, i \, \lambda_1^{i-1} e_1$$
$$f(\lambda_1) f_1 + f'(\lambda_1) e_1 = P(\lambda_1) f_1 + P'(\lambda_1) e_1$$

Since $f_1$ and $e_1$ are linearly independent,

$$f(\lambda_1) = P(\lambda_1)$$
$$f'(\lambda_1) = P'(\lambda_1)$$

The latter is the additional equation needed to find $c_0, c_1, c_2$.
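The derivative condition can be checked numerically on a single Jordan block (a NumPy sketch with an illustrative eigenvalue, not from the notes): for $J = \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix}$ the minimal polynomial is $(s-\lambda)^2$, so $p(s) = c_0 + c_1 s$ must match both $f(\lambda)$ and $f'(\lambda)$.

```python
# For a 2x2 Jordan block, the conditions f(lam) = p(lam) and
# f'(lam) = p'(lam) fully determine p(s) = c0 + c1*s.
# Here f = exp: c1 = f'(lam) = e^lam and c0 = f(lam) - lam * c1.
import numpy as np

lam = 2.0                      # illustrative eigenvalue
J = np.array([[lam, 1.0],
              [0.0, lam]])

c1 = np.exp(lam)               # from f'(lam) = p'(lam) = c1
c0 = np.exp(lam) - lam * c1    # from f(lam) = p(lam) = c0 + c1*lam

expJ_poly = c0 * np.eye(2) + c1 * J

# Reference: truncated power series sum_i J^i / i!
expJ_series = np.zeros((2, 2))
term = np.eye(2)
for i in range(40):
    expJ_series += term
    term = term @ J / (i + 1)

assert np.allclose(expJ_poly, expJ_series)
print("f(lam) and f'(lam) conditions reproduce e^J")
```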


General Case

Let $m(s) = (s-\lambda_1)^{m_1} (s-\lambda_2)^{m_2} \cdots (s-\lambda_{\sigma})^{m_{\sigma}}$. We have the following set of equations.

$$f^{(t)}(\lambda_i) = P^{(t)}(\lambda_i)$$

where $t = 0, 1, 2, \cdots, m_i - 1$ and $i = 1, 2, \cdots, \sigma$. This gives $\sum_{i=1}^{\sigma} m_i = l$ equations.
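The general recipe can be sketched in code (the helper `matrix_function_coeffs` and its interface are illustrative, not from the notes): collect the $l$ conditions $f^{(t)}(\lambda_i) = p^{(t)}(\lambda_i)$ into a linear system for $c_0, \dots, c_{l-1}$ and solve it.

```python
# Build the l equations f^(t)(lam_i) = p^(t)(lam_i) as a linear system
# in the polynomial coefficients c_0, ..., c_{l-1} and solve it.
import numpy as np
from math import factorial

def matrix_function_coeffs(eigs_mults, f_deriv):
    """eigs_mults: list of (lambda_i, m_i) pairs; f_deriv(lam, t) = f^(t)(lam)."""
    l = sum(m for _, m in eigs_mults)
    rows, rhs = [], []
    for lam, m in eigs_mults:
        for t in range(m):
            # t-th derivative of s^j evaluated at lam is j!/(j-t)! * lam^(j-t)
            rows.append([factorial(j) // factorial(j - t) * lam**(j - t)
                         if j >= t else 0.0 for j in range(l)])
            rhs.append(f_deriv(lam, t))
    return np.linalg.solve(np.array(rows, float), np.array(rhs, float))

# Applied to f(s) = sin(pi*s) with m(s) = s^3 (s-1)^2, as in the example below;
# the t-th derivative of sin(pi*s) is pi^t * sin(pi*s + t*pi/2).
c = matrix_function_coeffs(
    [(0.0, 3), (1.0, 2)],
    lambda lam, t: np.pi**t * np.sin(np.pi * lam + t * np.pi / 2))
print(c)  # coefficients c_0 ... c_4
```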


Example: $A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$. Find $\sin(\pi A)$.

Solution:

$m(s) = s^3 (s-1)^2$, and the degree of the minimal polynomial is $l = 5$.

$$p(s) = c_0 + c_1 s + c_2 s^2 + c_3 s^3 + c_4 s^4$$

$$p(A) = c_0 I + c_1 A + c_2 A^2 + c_3 A^3 + c_4 A^4$$

We need five equations to find $c_0, c_1, c_2, c_3, c_4$.

Starting with

$$\begin{align*} f(s) &= \sin(\pi s) \\ f'(s) &= \pi \cos(\pi s) \\ f''(s) &= -\pi^2 \sin(\pi s) \end{align*}$$

Then

$$\begin{align*} p(s) &= c_0 + c_1 s + c_2 s^2 + c_3 s^3 + c_4 s^4 \\ p'(s) &= c_1 + 2 c_2 s + 3 c_3 s^2 + 4 c_4 s^3 \\ p''(s) &= 2 c_2 + 6 c_3 s + 12 c_4 s^2 \end{align*}$$

For the first eigenvalue, $\lambda_1 = 0$ with $m_1 = 3$:

$$\begin{align*} f(\lambda_1) &= p(\lambda_1) \\ f'(\lambda_1) &= p'(\lambda_1) \\ f''(\lambda_1) &= p''(\lambda_1) \end{align*}$$
$$\begin{align*} f(0) &= p(0) \\ f'(0) &= p'(0) \\ f''(0) &= p''(0) \end{align*}$$
$$\begin{align*} \sin(0) &= 0 = c_0 \\ \pi \cos(0) &= \pi = c_1 \\ -\pi^2 \sin(0) &= 0 = 2 c_2 \end{align*}$$

For the second eigenvalue, $\lambda_2 = 1$ with $m_2 = 2$:

$$\begin{align*} f(\lambda_2) &= p(\lambda_2) \\ f'(\lambda_2) &= p'(\lambda_2) \end{align*}$$
$$\begin{align*} \sin(\pi) &= 0 = c_0 + c_1 + c_2 + c_3 + c_4 \\ \pi \cos(\pi) &= -\pi = c_1 + 2 c_2 + 3 c_3 + 4 c_4 \end{align*}$$

Collecting all five conditions into a linear system,

$$\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 & 4 \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \\ c_4 \end{bmatrix} = \begin{bmatrix} 0 \\ \pi \\ 0 \\ 0 \\ -\pi \end{bmatrix}$$

$$\begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \\ c_4 \end{bmatrix} = \begin{bmatrix} 0 \\ \pi \\ 0 \\ -2\pi \\ \pi \end{bmatrix}$$
$$p(s) = \pi s - 2\pi s^3 + \pi s^4$$
$$\sin(\pi A) = \pi A - 2\pi A^3 + \pi A^4$$
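The result can be verified numerically (a NumPy sketch, not part of the original notes) by comparing the polynomial form against the power series $\sin(M) = \sum_{k \ge 0} (-1)^k M^{2k+1}/(2k+1)!$ with $M = \pi A$:

```python
# Check sin(pi*A) = pi*A - 2*pi*A^3 + pi*A^4 against the sine power series.
import numpy as np

A = np.array([[0, 1, 0, 0, 0],
              [0, 0, 1, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 0, 0, 1, 1],
              [0, 0, 0, 0, 1]], dtype=float)

pi = np.pi
sin_poly = (pi * A
            - 2 * pi * np.linalg.matrix_power(A, 3)
            + pi * np.linalg.matrix_power(A, 4))

# Reference: sin(M) = sum_{k>=0} (-1)^k M^(2k+1) / (2k+1)!  with M = pi*A
M = pi * A
sin_series = np.zeros((5, 5))
term = M.copy()
for k in range(25):
    sin_series += term
    # next odd-power term: multiply by -M^2 / ((2k+2)(2k+3))
    term = -term @ M @ M / ((2 * k + 2) * (2 * k + 3))

assert np.allclose(sin_poly, sin_series)
print("polynomial form matches the sine series")
```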

Remark: $f(A)$ does not exist when $f^{(t)}(\lambda_i)$ does not exist for some required $t$ and $i$.


#EE501 - Linear Systems Theory at METU